Effectively Finding Relevant Web Pages from Linkage Information

Authors

Jingyu Hou

Yanchun Zhang

Abstract

This paper presents two hyperlink analysis-based algorithms for finding pages relevant to a given Web page (URL). The first algorithm is derived from an extended cocitation analysis of Web pages. It is intuitive and easy to implement. The second takes advantage of linear algebra theories to reveal deeper relationships among Web pages and to identify relevant pages more precisely and effectively. The experimental results show the feasibility and effectiveness of the algorithms. These algorithms could be used for various Web applications, such as enhancing Web search. The ideas and techniques in this work could also be helpful for other Web-related research.
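As a rough illustration of the idea behind the first algorithm, the sketch below shows plain cocitation counting: two pages are treated as related when the same parent pages link to both. It is not the authors' extended algorithm, which the abstract does not spell out. The function name `cocited_pages`, the dictionary-based link graph, and the toy page names are all assumptions made for this example.

```python
from collections import Counter


def cocited_pages(inlinks, outlinks, target, top_k=5):
    """Rank pages by how many parent pages link to both them and `target`.

    inlinks:  dict mapping a page to the set of pages that link to it
    outlinks: dict mapping a page to the set of pages it links to
    (Both structures are hypothetical; a real system would build them
    from crawled link data.)
    """
    counts = Counter()
    # Every parent of the target "cocites" the target with its other children.
    for parent in inlinks.get(target, ()):
        for sibling in outlinks.get(parent, ()):
            if sibling != target:
                counts[sibling] += 1
    return counts.most_common(top_k)


# Toy link graph: p1, p2 and p3 all link to the target page "u".
outlinks = {
    "p1": {"u", "a", "b"},
    "p2": {"u", "a"},
    "p3": {"u", "c"},
}

# Derive the reverse (inlink) index from the outlinks.
inlinks = {}
for parent, children in outlinks.items():
    for child in children:
        inlinks.setdefault(child, set()).add(parent)

ranked = cocited_pages(inlinks, outlinks, "u")
print(ranked)
```

In this toy graph, page "a" shares two parents with "u" while "b" and "c" share one each, so "a" ranks first. A practical version would likely also need to discount navigational links and pages from the same site, which this sketch ignores.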


Similar resources

Artificial Bee Colony (ABC) Approach for Ranking Web Pages

The World Wide Web (WWW) is growing rapidly in all aspects and is a massive, explosive data resource. In information retrieval, web search engines are the predominant tools for finding and accessing the contents of the web. The primary goal of a search engine is to provide relevant information to users according to their needs. Usually search engines give large result s...


Enhancing Web Search through Query Expansion

Web search engines help users find relevant web pages by returning a result set containing the pages that best match the user’s query. When the identified pages have low relevance, the query must be refined to capture the search goal more effectively. However, finding appropriate refinement terms is difficult and time consuming for users, so researchers developed query expansion approaches to i...


Information Retrieval Issues on the World Wide Web

The World Wide Web (Web) is the largest information repository, containing billions of interconnected documents (called web pages) authored by billions of people and organizations. The Web is huge, diverse, unstructured or semi-structured, dynamic, and multilingual, which makes searching it effectively and efficiently a challenging research problem....


Prioritize the ordering of URL queue in Focused crawler

The enormous growth of the World Wide Web in recent years has made it necessary to perform resource discovery efficiently. For a crawler, it is not a simple task to download domain-specific web pages. This unfocused approach often shows undesired results. Therefore, several new ideas have been proposed; among them, a key technique is focused crawling, which is able to crawl particular topical...


Analyzing new features of infected web content in detection of malicious web pages

Recent improvements in web standards and technologies enable attackers to hide and obfuscate infectious code with new methods and thus escape security filters. In this paper, we study the application of machine learning techniques in detecting malicious web pages. In order to detect malicious web pages, we propose and analyze a novel set of features including HTML, JavaScript (jQuery...



Journal:

IEEE Trans. Knowl. Data Eng.

Volume 15

Published 2003
